Appendix: Tech for creating, editing and collaborating on this ‘Bookdown’ web book/project (and starting your own)

This tutorial was written by Oska Fentem with David Reinstein.

Introduction

This appendix provides a brief introduction to the software and processes used to create websites such as Increasing Effective Charitable Giving and Researching and writing for Economics students. We aim to encourage others to participate in this collaborative work, and to spin off their own projects. If you would like to provide feedback or ask a question about these projects, using ‘hypothes.is’ is an easy way to do so.

The template for my bookdown projects is maintained in my repo here

This site (web-book project) is:

  • Hosted on GitHub (GitHub Pages)
  • A project managed out of a Git repo stored on GitHub


The content is:

  • A ‘Bookdown’ (in the ‘Gitbook’ style, although we’ve drawn elements from the Tufte style)
  • …which is a hosted collection of HTML (and other) files…
  • …constructed/compiled/built from R-Markdown (.Rmd) files and other support files using the R language


This relies heavily on:

  • ‘Markdown syntax’ for basic writing/formatting
  • Latex for mathematics notation
  • Bibtex for references/citations
  • ‘Pandoc’ to convert between different document formats
  • CSS (style sheets)


To build this, we chose to use tools and software including:

  • The RStudio environment for working with R code
  • Github desktop to manage pushing/pulling and integrating content (although sometimes we use raw Git)
  • Features of the GitHub website such as ‘projects’

We first give a brief overview of R & RStudio, Git & Github, and R Markdown & Bookdown, linking more extensive further resources/tutorials.

Git and Github

  • Git is a version control system that lets users track changes and progress in coding projects, or in any set of files. It is particularly useful for collaboration because it records who altered which files and when. Users can also clone a repository (a project folder whose full history of changes is tracked by Git) and make changes without affecting the original project. Git also provides a simple way to keep a project in sync across machines and operating systems such as Windows and Mac. Installing and configuring Git can be confusing for new users; Happy Git provides a user-friendly tutorial on installing Git, which can be downloaded here.

Getting a Github account should take about XXX minutes.

Here’s a guide to exactly how to do it.

Installing Git and the GitHub Desktop should take about XXX minutes.

Here’s a guide to exactly how to do it.

Some key things to know about Git and GitHub

Git and GitHub can be a bit confusing. Here are some things that I wish I had known, and that took me a while to figure out (unfold)

  • Git and Github are not the same thing … (explain)

  • A ‘commit’ does not actually change the files in the shared (remote) Github repo; you need to ‘push’ to do that

  • After ‘pulling’ from the remote repo, you may need to merge changes… (explain)

  • You can have several different ‘branches’ of the same Repo existing at the same time. When you switch to a new ‘branch’ the files you see on your computer will instantly and amazingly change to exactly the files in that branch. But don’t worry, the old branch is not lost.

  • … add some more


A brief overview of key functions inside Git (assuming a remote Github repo) including commits, pushes & pulls, forks & branches and pull requests: (unfold)

  • A commit saves a snapshot of your changes to the local repository. You must specify which changes to include in each commit. This process is made much easier using a program such as GitHub Desktop rather than raw Git commands (although they do the same thing, and the latter is more flexible).
  • A push sends all local commits to the online version of the repository, updating the online files to match the version stored locally on your device.
  • A pull brings the changes made in the online repository into the local repository, making the local repository up to date with the remote/online repository.
  • Creating a branch gives you a separate line of work within a repository, so you can make changes without affecting the master/original branch.
  • A pull request then proposes merging the changes made in a branch into the master branch, so that collaborators can review and merge the work.
  • As noted, GitHub Desktop provides a simpler and more intuitive user interface to Git. There are a variety of other interfaces.
    • Git and GitHub can also be integrated into RStudio and into many other tools, such as the Atom text editor.
  • Repos that are stored on GitHub can be accessed via a browser at github.com. The GitHub website itself provides a wide variety of tools, discussed further below under ‘GitHub web page’.
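The commit/push/pull cycle described above can be sketched as a sequence of command-line Git commands. The sketch below is written in R only to stay in the language used throughout this book, and it builds the command strings rather than running them. The helper name `git_cycle`, the branch name and the commit message are hypothetical, and the remote is assumed to be called `origin` (the usual default after cloning).

```r
# Build (but do not run) the Git commands for one edit cycle on a new branch.
# `branch` and `message` are placeholders chosen for illustration.
git_cycle <- function(branch, message) {
  c(
    "git pull",                                # update the local repo from the remote
    paste("git checkout -b", branch),          # create and switch to a new branch
    "git add -A",                              # stage all changes for the next commit
    paste("git commit -m", shQuote(message)),  # record the staged changes locally
    paste("git push -u origin", branch)        # send the local commits to the remote
  )
}

cmds <- git_cycle("fix-typos", "Fix typos in appendix")
writeLines(cmds)
```

Each string can be typed into a terminal as is. GitHub Desktop performs the same steps through its buttons and menus.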

R and RStudio

  • R is a free programming language which is mainly used for data analysis and statistics. It can be downloaded here. The popularity of R is growing in academic economics, largely due to the growth of machine learning techniques in R as well as the flexibility of the language itself. R makes use of packages, which are collections of functions written to achieve specific tasks. While R comes with a variety of useful packages pre-installed, it is often useful to install more, which can be done using the install.packages command.

    If you are familiar with Python, these R packages are roughly comparable to Python’s modules.
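As a minimal sketch of the install.packages step, the helper below installs only those packages that are not already available. The function name `ensure_installed` is hypothetical, and the package names are just examples; both ship with R, so nothing is downloaded here.

```r
# Install only the packages that are not already available.
# Returns (invisibly) the names of the packages that had to be installed.
ensure_installed <- function(pkgs) {
  available  <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  to_install <- pkgs[!available]
  if (length(to_install) > 0) {
    install.packages(to_install)  # downloads from the configured CRAN mirror
  }
  invisible(to_install)
}

# 'stats' and 'utils' come with R, so this call installs nothing.
newly_installed <- ensure_installed(c("stats", "utils"))
```

Once a package is installed, `library(packagename)` loads it for the current session.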

Installing R should take about XXX minutes.

Here’s a guide to exactly how to do it.
  • RStudio is a programming environment and interface which helps facilitate a variety of tasks such as writing scripts using R (as well as other languages), and building/knitting these into various document formats. RStudio ‘Addins’ can also be extremely useful for things like tracking ‘todos’, adding citations, and formatting code. RStudio can also be configured so as to work seamlessly with Git (more on this later). RStudio can be downloaded here

Installing and configuring RStudio should take about XXX minutes.

Here’s a guide to exactly how to do it.

Markdown and Bookdown

  • Markdown is a popular set of formats (really a ‘syntax for specifying output’) for authoring and generating documents. The Rmarkdown format (rmarkdown package) is one flavor of Markdown that works with R to enable ‘dynamic documents’ involving text, data analysis, and other elements. It can export your work to a variety of outputs such as HTML, PDF and Word documents. It can also be used to create web pages, such as the one you are currently reading. The power of Rmarkdown files comes from their ability to include/embed code as well as data and tables, which is useful for writing reproducible research and creating websites.

  • The Bookdown package is built on the Rmarkdown package, but it adds many features to enable larger and more structured output, particularly ‘web books’ and web sites. As we use it, these books combine multiple Rmarkdown files, with each such ‘Rmd’ file becoming its own HTML page.

Look at the list of headings on the left of this page: each second-level header is its own web page (a distinct html link). “All the content in one scrolled page” is limited to a single first-level header.

The code and folder structure in this repo, and what it means

Writing_econ_book: Files-folders of interest (taken from readme March 2020)

docs: html output put here for web hosting

Folder: writing_econ_book

  • _bookdown.yml: determines which files are included in the book
  • writing_econ_gfm.Rmd: The main content; body of the book (many chapters)
  • index.Rmd: Setup content and some styling/parameters; determines how the book is built (into which format, etc)
  • header_include.html: Important commands included here including folding boxes
  • references_cut.bib: bibtex references referred to in ‘@ref’ notes
  • tufte_plus.css: Determines layout and styling
  • writing_econ_book.Rproj: ‘project’ … to work on this in R-studio

The code in a single “.Rmd” file and how it translates into content

Basic (R-)Markdown

The Markdown format offers a simple plain-text notation for specifying the elements of documents, reports, web sites, etc. (It is much simpler and easier to read than html, latex, etc.) It is widely used by programmers, on comment boards/forums, and throughout the internet. For example, GitHub.com automatically renders markdown code, particularly in readme.md files.

Actually there are several varieties of markdown, but they mainly share key elements.

Markdown documents are usually saved as plain text files with the extension .md, e.g., report.md. These allow for an easy way to create a variety of outputs, particularly reports and text-focused web pages. The markdown format is converted into other formats (html, latex, etc.) with a variety of tools, particularly something called Pandoc.

What is Pandoc?

Pandoc is a tool (a program) for converting from one document format to another. It is incredibly powerful. The great thing about a format like markdown, or r-markdown, is that it is simple to write and peruse, and, with the help of Pandoc, it can convert into many many other useful formats for web pages, documents, presentations, etc.

Pandoc is built into other tools including the RMarkdown package (see discussion on Stackexchange here).

You can also install and use Pandoc directly in the command line, or try it out (in a limited but still useful way) on the web here

For more on Pandoc visit pandoc.org

In the R (statistically focused) language there are tools such as knitr that allow R users to produce reports combining text, statistical output, and interactive content. These are generally written in “R-markdown” documents, saved as .Rmd rather than .md files. The RStudio interface, and several “add-ins”, also help facilitate this. This interface is very useful; in fact, it may be convenient to build web books and other content using it even if you are not planning to extensively integrate R code and data. (As in the present book, although I’m hoping to build this in.)

Using R-markdown and knitr (and other tools and add-ins like ‘Bookdown’), content from multiple sources can be embedded into these documents, allowing users to easily display objects such as plots or regression output.

Some simple markdown rules

Text can be made italic using single asterisks *italic* or bold using double asterisks **bold**.

Hashtags/pound signs (#) specify headers and subheaders, e.g., this third-level subsection header was created with the code:

### Some simple markdown rules {#simple-md-rules}

Where the bit in the curly braces allows us to link-back with the code [link back text whatever](#simple-md-rules) … rendering as link back text whatever.

Other key features are ordered lists and unordered lists:

- unordered first entry
- unordered second entry
     - subelement of second entry

While basic markdown has a limited set of rules, there are many more formatting and content options for documents produced in (R)-Markdown, far too many to detail here. These may combine markdown code, html code, latex code, and more. The following cheatsheets are very useful for writing (R)-markdown documents:


See also (most useful, but highly detailed):

Code chunks provide an easy way to embed code into your R Markdown files. The code is not limited to R either; other languages can be used. This means that a wide variety of content can be displayed by a chunk, such as tables of data:

##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

Code chunks are defined by wrapping text inside ``` ```. The above example was coded using:

```{r}
head(iris)
```

Options can be specified inside the curly brackets {}. More information is provided here.

Inline code

Inline code is a quick and easy way to include snippets of R code within text. As an alternative to using code chunks, R code can simply be placed inside of `r `. For example, this can be used as an easy way to insert the value of a variable into a paragraph without inserting a chunk.
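A minimal sketch of how this might look in practice, using the built-in iris data shown earlier. The variable names are made up for this example, and the sentence in the comment shows the kind of line that would go in the text of the .Rmd file.

```r
# Values computed in an ordinary chunk...
n_flowers   <- nrow(iris)
mean_length <- round(mean(iris$Sepal.Length), 2)

# ...can then be dropped into a sentence of the .Rmd text with inline code:
#   The data contain `r n_flowers` flowers, with a mean sepal length
#   of `r mean_length` cm.
```

When the document is knitted, each inline expression is replaced by its value, so the numbers in the text update automatically if the data change.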

Latex/maths

R Markdown also can make use of the LaTeX document preparation system, which is popular for writing technical documents with mathematical content. This allows us to publish documents which include equations such as:

\[y = \beta_0+\beta_1x_1 +\beta_2x_2+...+\beta_kx_k+u\]

This is written using $$y = \beta_0+\beta_1x_1 +\beta_2x_2+...+\beta_kx_k+u$$. Using $$ means that the equation is displayed on its own line, centered on the page. Alternatively, $ can be used in the same way to place the maths inline with the text, without the centering.
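As a small illustration (the sentence is made up for this example, using a one-regressor version of the model above), inline and display maths might be combined like this in an .Rmd file:

```latex
The coefficient $\beta_1$ is the change in $y$ for a one-unit change in $x_1$:

$$y = \beta_0 + \beta_1 x_1 + u$$
```

The symbols in single dollar signs stay within the sentence, while the equation in double dollar signs is set on its own centered line.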

A very useful guide to maths in R Markdown provides a detailed outline of the various mathematical symbols which can be used.

Custom styles

Bookdown allows for users to build their own custom styles in order to change the appearance of documents. To create styles for HTML projects a custom css file is used. For these projects, styles are contained in tufte_plus.css. To use a defined style, the user can specify options at the start of a chunk, or using a HTML wrapper as shown below for margin notes.

More on creating styles here. There are 3 main custom styles which are used throughout the projects:

  • ‘Notes’

Formatted ‘Notes’ have been defined in this work which allow text to be placed in coloured blocks such as this one. To use this note style, {block2, type ='note'} can be specified at the start of the block, or a HTML wrapper can be used. This assigns the .note formatting from the tufte_plus.css file to the chunk.

  • Margin notes

Margin notes are used throughout these projects as a way of displaying information in an organised and aesthetically pleasing way. The margin notes used in this project are inspired by the Tufte handout style developed by American statistician Edward Tufte. To add a margin note, text is placed inside the following HTML wrapper:

<div class="marginnote">
Your margin note goes here.
</div>

Or margin notes can be added by using chunk options.

  • Folding boxes
Folding boxes also provide a useful way to incorporate content without cluttering the page. Similarly to the ‘notes’ the folding boxes are defined in tufte_plus.css and called by specifying {block2, type='fold'} at the start of a chunk, or using a HTML wrapper.

Adding references/citations

As with any academic work, it is always important to reference sourced material. Across these projects the following software is used:

  • Setup

Pandoc provides a way to generate formatted references as well as a bibliography in Markdown.

(This is done through Pandoc tools.)

Oska, are you sure about this?

The bibliography file to be sourced is specified within ‘YAML’ content, which guides the processing of these documents. (YAML content is generally enclosed with a three-dash --- break at top and bottom.)

I generally specify the bibliography source in the YAML at the top of the .Rmd file or, for Bookdown projects, in the YAML content in index.Rmd.

@Oska: we should try to explain this YAML stuff a bit better.

  • BibTeX

The BibTeX format is a stylized file format used predominantly for lists of references, mainly and originally for working with LaTeX. BibTeX bibliographies use the .bib extension. For example, the bibliography for this project is giving_keywords.bib. For more information on BibTeX see here.

  • Citr package (addin) for RStudio

The Citr package provides functions to search Zotero and BibTeX libraries in order to insert references into Markdown files. Citr also features a plugin for RStudio which makes the referencing process even easier. Instructions for download, as well as a demonstration of the Rstudio plugin are provided here.

  • Zotero

Zotero is a free open source reference manager, which enables users to sync their library of references across multiple devices. Similarly to other reference managers, Zotero offers plugins for popular browsers such as Chrome and Safari. Download Zotero

  • Better BibTeX for Zotero

Better BibTeX for Zotero is an add-on for Zotero. Among other things, it allows the Zotero library to be exported for use in Markdown. Installation instructions are provided here.

I currently (25 Apr 2020) have Zotero automatically maintain/output the key .bib file to a Dropbox folder. Each project has code to routinely download this file with a command such as:

download.file(url = "https://www.dropbox.com/s/3i8bjrgo8u08v5w/reinstein_bibtex.bib?raw=1", destfile = "reinstein_bibtex.bib")
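One possible refinement, sketched below, is to skip the download when a recent local copy already exists and to keep the old file if the download fails (for example when building offline). The function name `refresh_bib` and the 24-hour default are assumptions made for this illustration, not part of the projects' actual code.

```r
# Refresh a local .bib file from a URL only if it is missing or stale.
# Returns TRUE if the file was refreshed, FALSE if the old copy was kept.
refresh_bib <- function(url, destfile, max_age_hours = 24) {
  if (file.exists(destfile)) {
    age_hours <- as.numeric(difftime(Sys.time(), file.mtime(destfile), units = "hours"))
    if (age_hours < max_age_hours) return(FALSE)  # recent enough: no download
  }
  # Download to a temporary file first so a failed download
  # cannot damage the existing bibliography.
  tmp <- tempfile(fileext = ".bib")
  ok <- tryCatch(
    download.file(url = url, destfile = tmp, quiet = TRUE) == 0,
    error = function(e) FALSE
  )
  if (ok) {
    file.copy(tmp, destfile, overwrite = TRUE)
  } else {
    warning("Could not refresh ", destfile, "; keeping any existing copy.")
  }
  ok
}

# Example call (not run here because it needs network access):
# refresh_bib("https://www.dropbox.com/s/3i8bjrgo8u08v5w/reinstein_bibtex.bib?raw=1",
#             "reinstein_bibtex.bib")
```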

How to ‘build’ and view the book

One way is within RStudio

  • Be sure Github repo is synced so all files are present

  • Packages need to be installed, but this should (?) be done automatically when you build via the source(here("code", "baseoptions.R")) line in index.Rmd

    • knitr is a key package
  • Click ‘Build’, ‘Build all’, or the shortcut key shift-cmd-b

    • … this seems to run the command rmarkdown::render_site(encoding = 'UTF-8')

Building may take some time, depending on how much code is present in the Rmd files and what that code does

  • It puts all the Rmd files specified in the _bookdown.yml into a single file, here labeled barriers-to-effective-giving.knit.md (I think), and then turns that into html, also invoking bibtex along the way

  • Depending on your RStudio settings (-Tools, -Project Options, -Build tools, -Preview book after building), it may put up a ‘preview version’ of the site

  • All the ‘new’ output is directed to be put in the ‘docs’ folder, a bunch of html files. You can view those ‘local’ files in any web browser

  • Once you commit and push, the ‘new’ bookdown website should be up on the WWW
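As an alternative to the ‘Build’ button, the book can also be built from the R console. The sketch below uses `bookdown::render_book()`; the wrapper name `build_book` is hypothetical, and the defaults (index.Rmd as input, the gitbook format, output in ‘docs’) are assumptions based on the setup described above rather than a copy of this project's build script.

```r
# Build the whole book and write the HTML files to the 'docs' folder.
# Wrapped in a function so that nothing is rendered until it is called.
build_book <- function(input = "index.Rmd", output_dir = "docs") {
  if (!requireNamespace("bookdown", quietly = TRUE)) {
    stop("The 'bookdown' package is needed: install.packages('bookdown')")
  }
  bookdown::render_book(
    input         = input,
    output_format = "bookdown::gitbook",  # the 'Gitbook' style mentioned earlier
    output_dir    = output_dir
  )
}

# build_book()  # run from the folder containing index.Rmd and _bookdown.yml
```

If the project's own YAML already sets the output format and output directory, calling `bookdown::render_book("index.Rmd")` alone may be enough.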

Joining this project

  • Get a Github account, contact daaronr AT gmail.com and tell him your github account ID (or the email you used to join should probably work as well)

  • Remember to ‘accept’ the invitation to the repos (here, the EA_giving_barriers repo; and possibly some other supporting repos as well). You should receive this invitation via email, and it should also appear in your “notifications” on GitHub.

Creating a Branch and a ‘pull request’

GitHub web page content

As noted, GitHub is a website and interface that acts as an external server and storage space for Git projects/repos. It works well with Git and also incorporates several additional features. You can view, and even interact with, much of a repo simply via the GitHub website without installing Git (but I strongly recommend that you do install Git as well as a tool like GitHub Desktop, unless you want to rely solely on command-line Git).

Web page for a repo

[Figure: EA barriers repo GitHub starting page]

In your account, when you click on the repo you’ll see something like the screen above. There are many tabs, starting with the code tab. At the top of this, you will see the list of folders and files, with messages describing the latest commits.

Note here we also see:

  • 48 ‘commits’

  • 2 ‘branches’

  • 2 contributors

Below this, some options allowing you to switch branch, manually upload files, clone or download etc.

Once you’ve installed Git you will want to ‘clone this repo’ to have it on your machine and to be able to easily work with it and commit and push and pull changes. You will do this clone either via the web site, GitHub Desktop or another application, or using the command line

Readme for a repo

Below the list of files you should see the “readme” for this repository. This is a file ‘README.md’ stored in the root directory of this repository. If you click on it or look at the file you’ll see it is written in markdown syntax but the GitHub website renders it into a nice format.

I typically use this readme to explain what the project is about and describe (and link) the folder structure.

Comments/notifications

In a variety of places within a repo, when you are adding comments or content you can refer to a collaborator, who will then receive a “notification” linking to this content. (These are also called “callouts” in some systems.) These may come as emails to that collaborator if they have enabled email notifications, but they will definitely appear as a notification under, again, that bell icon in the upper right-hand corner.

Seeing recent commits, history and ‘blame’

[Figure: Showing the most recent commits]


Above, this shows the most recent commits.

Clicking on one of these commits will show you ‘what changed’ and old versus new versions.


[Figure: Showing the most recent commits]

For example, above we see something like a “split diff” view, with the ‘old version’ (before this commit) on the left and the ‘new version’ on the right. What is new is in green (with a ‘+’), and what is removed is in red highlight (with a “-”).

Here we see that …

  • in the file ‘sections/inertia.rmd’ a space has been added after ‘crowding out?’

  • in ‘present_puzzle.Rmd’ an (obsolete) ‘underline’ notation has been replaced with a third level markdown header (three # marks)

This is but one way to view and consider changes. Various text editors such as Atom and Vim also offer great tools, as does the Git program itself and the GitHub Desktop application.

Commenting within commits, etc., tagging collaborators in this

One way to ask questions, comment on changes and let people know about changes you made, is via adding a comment within a commit itself. (Check: can this be tied to an ‘issue’?)

[Figure: Commenting on a part of a commit, notifying a collaborator]

Above, we see that by clicking on a plus sign that appears just to the right of the line number when viewing a commit, we can add a comment on that particular part of the commit. We can then flag another collaborator (see ‘@daaronr’ above … when you type the ‘@’ you get a dropdown of collaborators) who will be notified of this.

This mode of commenting and conversation has the advantage of avoiding cluttering up the actual code and text with excess comments.


You can also link each comment to an ‘issue’ (issues are discussed below) by adding a hash to the comment and citing the issue number. This comment and link to the place in the code or text for that commit will then show up when you look at that issue. This makes the discussion more organized, at least we hope.

[Figure: Link an issue in a comment]

The ‘Project’ board and ‘Issues’

  • the Project is a ‘Kanban board’ for managing tasks, responsibilities and progress

  • these should be entered as ‘issues’, enabling assignments and further discussion within the ‘issues’ pages


[Figure: A GitHub ‘Project’]

[Figure: Kanban for the GitHub ‘Project’]

[Figure: One task/issue in the Kanban]

[Figure: Viewing this issue and its discussion]

Airtable and innovationsinfundraising.org

This project is closely connected to innovationsinfundraising.org. The two projects overlap substantially, and there is a shared ‘database’ stored as an Airtable, ‘Giving researchers shared’.

We had an earlier … tutorial on using the Airtable and Innovationsinfundraising.org here

I add a few more points below, more relevant to the current project:

Airtable

Airtable is collaborative web-based software with a variety of displays and organizational structures; it has many features of a relational database, and even more features if one engages its API. It is user-friendly, with a GUI resembling a spreadsheet, and easy tutorials, instructions and examples. You can operate it from a browser or a web-driven app.

Key features of tables in Airtables (quick views)

Each Airtable user can have any number of Bases, and bases can be shared in work groups.

  • Command-K to jump to any other Base

“key_papers”; the papers providing the most relevant and strongest evidence for the tool, and “secondary papers”.

Key content

The “Categories” table provides and explains a number of “schema” we use to characterize both the tools and the “Barriers to effective giving” (discussed later).

[Figure: Categories table]


Key papers are stored and organised in the ‘papers_mass’ table. This is crosslinked in several other tables. Within each paper ‘row’ there is a variety of relevant information and discussion on each paper.

[Figure: Key papers]


The fundingwiki app automatically populates and updates information on the number of times each paper has been cited, using the Crossref database. Tools such as these will enable this to be a perennial resource, rather than a frozen-in-time evaluation.

[Figure: Citations auto update]


Note: Some but not all of the Airtable content discussed in the rest of this subsection has already been incorporated into the present bookdown.

The table “EAlit_sections” outlines the (earlier?) structure of the EA barriers paper, already providing links to information that will be integrated.

[Figure: Organizing the EAlit paper]

This table also links directly to the papers_mass table, organizing the papers we are referencing and reviewing in each section.

[Figure: Organizing EAlit papers]


The separate “Barriers to EAG” table is below.

This organizes and assembles the discussion and evidence on potential factors and categories of factors that may explain the limited amount of “effective giving”. This represents the largest part of our review paper; we focus on clear definitions of the most relevant psychological (and “behavioral economic”) biases, and carefully assess the available evidence. We focus specifically on evidence in the charitable domain, but we also consider the broader evidence for these biases in other contexts.

[Figure: Barriers to EAG]

Again, this is older work, maybe already incorporated into the Bookdown?

For each barrier or bias, we consider why it may be particularly relevant to effective giving.

[Figure: Barriers to EAG, why relevant]


We further propose and discuss tools addressing these barriers and promoting effective charitable giving.

[Figure: Tools and remedies]


Useful resources

Like most things, when working with code the internet is your best friend. Listed below are several useful resources for learning about the material mentioned above: